Inducing structure in reward learning by learning features
Abstract
Reward learning enables robots to learn adaptable behaviors from human input. Traditional methods model the reward as a linear function of hand-crafted features, but that requires specifying all the relevant features a priori, which is impossible for real-world tasks. To get around this issue, recent deep Inverse Reinforcement Learning (IRL) methods learn rewards directly from the raw state, but this is challenging because the robot has to implicitly learn the features that are important and how to combine them, simultaneously. Instead, we propose a divide-and-conquer approach: focus human input specifically on learning the features separately, and only then learn how to combine them into a reward. We introduce a novel type of human input for teaching features and an algorithm that utilizes it to learn complex features from the raw state space. The robot can then learn how to combine them into a reward using demonstrations, corrections, or other reward learning frameworks. We demonstrate our method in settings where all features have to be learned from scratch, as well as where some of the features are known. By first focusing human input specifically on the feature(s), our method decreases sample complexity and improves generalization of the learned reward over a deep IRL baseline. We show this in experiments with a physical 7-DoF robot manipulator, and in a user study conducted in a simulated environment.
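The two-stage idea in the abstract (features first, then a reward that combines them) can be illustrated with a minimal sketch. This is not the authors' algorithm: the feature functions below are hand-written stand-ins for features that would be taught by a human, the state layout `(x, y, z)`, the laptop position, and all function names are hypothetical, and the weight update is a simple feature-difference step chosen only to show how a linear reward over fixed features can be fit from a demonstration.

```python
from typing import Callable, List, Sequence

State = Sequence[float]             # assumed raw state: (x, y, z) of the end effector
Feature = Callable[[State], float]  # a feature maps a raw state to a scalar


def height_above_table(state: State) -> float:
    """Stand-in feature: height z above a table assumed to lie at z = 0."""
    return state[2]


def distance_to_laptop(state: State) -> float:
    """Stand-in for a feature that stage one would learn from human input."""
    laptop_xy = (0.5, 0.5)  # hypothetical object position
    dx, dy = state[0] - laptop_xy[0], state[1] - laptop_xy[1]
    return (dx * dx + dy * dy) ** 0.5


def feature_counts(trajectory: List[State], features: List[Feature]) -> List[float]:
    """Sum each feature over all states of a trajectory."""
    return [sum(f(s) for s in trajectory) for f in features]


def reward(trajectory: List[State], features: List[Feature], theta: List[float]) -> float:
    """Linear reward: dot product of the weights with the feature counts."""
    phi = feature_counts(trajectory, features)
    return sum(w * p for w, p in zip(theta, phi))


def update_weights(theta, demo, current, features, step=0.1):
    """Stage two (illustrative): move the weights toward the demonstration's features."""
    phi_demo = feature_counts(demo, features)
    phi_cur = feature_counts(current, features)
    return [w + step * (d - c) for w, d, c in zip(theta, phi_demo, phi_cur)]


features = [height_above_table, distance_to_laptop]
theta = [0.0, 0.0]
demo = [(0.0, 0.0, 0.1), (0.1, 0.0, 0.1)]     # stays low and away from the laptop
current = [(0.5, 0.5, 0.5), (0.5, 0.5, 0.5)]  # hovers high, directly over the laptop
theta = update_weights(theta, demo, current, features)
print(reward(demo, features, theta) > reward(current, features, theta))
```

With the features held fixed, the second stage only has to estimate one weight per feature, which is the intuition behind the reduced sample complexity the abstract reports relative to learning a reward directly from the raw state.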
Similar resources
Learning Higher-Order Graph Structure with Features by Structure Penalty
In discrete undirected graphical models, the conditional independence of node labels Y is specified by the graph structure. We study the case where there is another input random vector X (e.g. observed features) such that the distribution P(Y | X) is determined by functions of X that characterize the (higher-order) interactions among the Y's. The main contribution of this paper is to learn th...
Inducing Effective Pedagogical Strategies Using Learning Context Features
Effective pedagogical strategies are important for e-learning environments. While it is assumed that an effective learning environment should craft and adapt its actions to the user’s needs, it is often not clear how to do so. In this paper, we used a Natural Language Tutoring System named Cordillera and applied Reinforcement Learning (RL) to induce pedagogical strategies directly from pre-exis...
Reinforcement Learning by Comparing Immediate Reward
This paper introduces an approach to Reinforcement Learning Algorithm by comparing their immediate rewards using a variation of Q-Learning algorithm. Unlike the conventional Q-Learning, the proposed algorithm compares current reward with immediate reward of past move and work accordingly. Relative reward based Q-learning is an approach towards interactive learning. Q-Learning is a model free re...
Learning reward expectations in honeybees.
The aim of this study was to test whether honeybees develop reward expectations. In our experiment, bees first learned to associate colors with a sugar reward in a setting closely resembling a natural foraging situation. We then evaluated whether and how the sequence of the animals' experiences with different reward magnitudes changed their later behavior in the absence of reinforcement and wit...
Active Reward Learning
While reward functions are an essential component of many robot learning methods, defining such functions remains a hard problem in many practical applications. For tasks such as grasping, there are no reliable success measures available. Defining reward functions by hand requires extensive task knowledge and often leads to undesired emergent behavior. Instead, we propose to learn the reward fu...
Journal
Journal title: The International Journal of Robotics Research
Year: 2022
ISSN: 1741-3176, 0278-3649
DOI: https://doi.org/10.1177/02783649221078031